46 research outputs found

    A spherical microphone array based system for immersive audio scene rendering

    For many applications it is necessary to capture an acoustic field and present it to human listeners, creating the same acoustic perception for them as if they were actually present in the scene. Possible applications of this technique include entertainment, education, military training, remote telepresence, and surveillance. Recently, there has been much interest in the use of spherical microphone arrays for acoustic scene capture and reproduction. We describe a 32-microphone spherical-array-based system implemented for spatial audio capture and reproduction. The array embeds hardware that is traditionally external, such as preamplifiers, filters, digital-to-analog converters, and a USB interface adapter, resulting in a portable, lightweight solution that requires no hardware on the PC side other than a high-speed USB port. We provide a capability analysis of the array and describe the software suite developed for the application.

    Spherical Microphone Array Based Immersive Audio Scene Rendering

    Presented at the 14th International Conference on Auditory Display (ICAD2008), June 24-27, 2008, Paris, France. In many applications such as entertainment, education, military training, remote telepresence, and surveillance, it is necessary to capture an acoustic field and present it to listeners with the goal of creating the same acoustic perception for them as if they were actually present at the scene. Currently, there is much interest in the use of spherical microphone arrays for acoustic scene capture and reproduction. We describe a 32-microphone spherical-array-based system implemented for spatial audio capture and reproduction. Our array embeds hardware that is traditionally external, such as preamplifiers, filters, digital-to-analog converters, and a USB adapter, resulting in a portable, lightweight solution that requires no hardware on the PC side other than a high-speed USB port. We provide a capability analysis of the array and describe the software suite developed for the application.
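
    Neither spherical-array abstract above spells out the rendering math, but the standard processing chain for a rigid-sphere array of this kind is order-limited spherical harmonic beamforming. The sketch below is illustrative only, assuming a sound-hard sphere, a single analysis frequency, and hypothetical capsule coordinates; the function names and the simple quadrature rule are assumptions, not part of the authors' software suite.

        import numpy as np
        from scipy.special import sph_harm, spherical_jn, spherical_yn

        def mode_strength(n, ka):
            # b_n(ka) for a plane wave scattered by a rigid sphere of radius a
            jn = spherical_jn(n, ka)
            jnp = spherical_jn(n, ka, derivative=True)
            hn = jn + 1j * spherical_yn(n, ka)
            hnp = jnp + 1j * spherical_yn(n, ka, derivative=True)
            return 4 * np.pi * (1j ** n) * (jn - (jnp / hnp) * hn)

        def steer(p, mic_az, mic_col, look_az, look_col, ka, order=4):
            """Beamform toward (look_az, look_col); angles in radians as
            (azimuth, colatitude). p: complex capsule pressures at one
            frequency, assuming a near-uniform 32-capsule layout."""
            y = 0.0 + 0.0j
            for n in range(order + 1):
                for m in range(-n, n + 1):
                    # spherical harmonic coefficient via discrete quadrature
                    pnm = (4 * np.pi / p.size) * np.sum(
                        p * np.conj(sph_harm(m, n, mic_az, mic_col)))
                    # dividing by b_n is ill-conditioned where |b_n| is
                    # small; practical designs regularize this step
                    y += (pnm / mode_strength(n, ka)) * sph_harm(m, n, look_az, look_col)
            return y

    With 32 capsules the usable order is about four, since (N+1)^2 coefficients must not exceed the microphone count (25 at N=4, 36 at N=5), which is why arrays of this size are typically analyzed up to fourth order.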

    Customizable auditory displays

    Presented at the 8th International Conference on Auditory Display (ICAD), Kyoto, Japan, July 2-5, 2002. High-quality virtual audio scene rendering is essential for emerging virtual and augmented reality applications, for perceptual user interfaces, and for sonification of data. Personalization of head-related transfer functions (HRTFs) is necessary in applications where perceptual realism and correct elevation perception are critical. We describe algorithms for the creation of virtual auditory spaces by rendering cues that arise from anatomical scattering, environmental scattering, and dynamic effects. We use a novel way of personalizing HRTFs from a database, based on anatomical measurements. Details of algorithms for HRTF interpolation, room impulse response creation, HRTF selection from a database, and audio scene presentation are presented. Our system runs in real time on an office PC without specialized DSP hardware.
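
    The abstract does not state the selection rule itself, but a common baseline for picking an HRTF set from a database using anthropometry is a nearest-neighbor match over normalized measurements. The sketch below shows that baseline; the array names and the z-scoring choice are assumptions, not necessarily the paper's exact procedure.

        import numpy as np

        def select_hrtf(user_anthro, db_anthro, db_hrtfs):
            """user_anthro: (F,) anatomical measurements of the listener;
            db_anthro: (S, F) per-subject measurements; db_hrtfs: (S, ...)
            measured HRTF sets. Returns the set of the closest subject."""
            mu = db_anthro.mean(axis=0)
            sd = db_anthro.std(axis=0)
            # z-score so that features measured in different units
            # (lengths, angles) contribute comparably to the distance
            z_db = (db_anthro - mu) / sd
            z_user = (user_anthro - mu) / sd
            best = np.argmin(np.linalg.norm(z_db - z_user, axis=1))
            return db_hrtfs[best]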

    Plane-wave decomposition of a sound scene using a cylindrical microphone array

    The analysis of microphone arrays formed by mounting microphones on a sound-hard spherical or cylindrical baffle is typically performed using a decomposition of the sound field in terms of orthogonal basis functions. An alternative representation in terms of plane waves, together with a method for obtaining the coefficients of such a representation directly from measurements, was proposed recently for the case of a spherical array. It was shown that representing the field as a collection of plane waves arriving from various directions simplifies both source localization and beamforming. In this paper, these results are extended to the case of the cylindrical array. As in the spherical-array case, localization and beamforming based on the plane-wave decomposition perform as well as the traditional orthogonal-function-based methods while being numerically more stable. Both simulated and experimental results are presented. Index Terms: acoustic fields, circular arrays, array signal processing, acoustic position measurement.
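
    Concretely, the decomposition divides each circular-harmonic coefficient of the measured pressure by the corresponding mode strength of the rigid cylinder and resynthesizes over candidate arrival angles. A minimal sketch, assuming a uniform circular arrangement of microphones on the baffle and a single frequency (the grid size and function names are illustrative):

        import numpy as np
        from scipy.special import jv, jvp, hankel1, h1vp

        def mode_strength(n, ka):
            # pressure mode on the surface of a rigid cylinder for a unit
            # plane wave (J_n minus its scattered Hankel-function part)
            return (1j ** n) * (jv(n, ka) - (jvp(n, ka) / h1vp(n, ka)) * hankel1(n, ka))

        def plane_wave_decomposition(p, mic_phi, ka, order=8, n_dirs=72):
            """Recover plane-wave amplitudes over arrival angles from
            pressures p measured at angles mic_phi on the cylinder."""
            phi_grid = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
            w = np.zeros(n_dirs, dtype=complex)
            for n in range(-order, order + 1):
                # circular-harmonic coefficient (uniform-grid quadrature)
                pn = np.mean(p * np.exp(-1j * n * mic_phi))
                # divide out the mode strength; this is the step whose
                # conditioning the paper compares against orthogonal-
                # function processing
                w += (pn / mode_strength(n, ka)) * np.exp(1j * n * phi_grid)
            return phi_grid, w

    The magnitude of the recovered spectrum peaks near the true arrival angle, so localization reduces to peak picking and beamforming to weighting the recovered plane waves.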

    Virtual autoencoder based recommendation system for individualizing head-related transfer functions

    We propose a virtual-autoencoder-based recommendation system for learning a user's head-related transfer functions (HRTFs) without subjecting the listener to impulse-response or anthropometric measurements. Autoencoder neural networks generalize principal component analysis (PCA) and learn non-linear feature spaces that support both out-of-sample embedding and reconstruction; this can be applied to developing a more expressive low-dimensional HRTF representation. One application is to individualize HRTFs by tuning along the autoencoder feature spaces. We demonstrate this new approach by developing a virtual (black-box) user that can localize sound from query HRTFs reconstructed from those spaces. Standard optimization methods tune the autoencoder features based on the virtual user's feedback. Experiments with CIPIC HRTFs show that the virtual user can localize along out-of-sample directions and that optimization in the autoencoder feature space improves upon initial non-individualized HRTFs. Other applications of the representation are also discussed.
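
    As a rough picture of the pipeline: train an autoencoder on HRTF magnitude spectra, then tune the latent code with a derivative-free optimizer driven by the virtual listener's feedback. The sketch below assumes hypothetical layer sizes, a spectrum length D, and a black-box scoring function virtual_user_score; none of these reflect the paper's exact architecture.

        import numpy as np
        import torch
        import torch.nn as nn
        from scipy.optimize import minimize

        D, Z = 200, 10  # spectrum length and latent size (assumed)
        enc = nn.Sequential(nn.Linear(D, 64), nn.Tanh(), nn.Linear(64, Z))
        dec = nn.Sequential(nn.Linear(Z, 64), nn.Tanh(), nn.Linear(64, D))

        def train(hrtfs, epochs=200):
            # hrtfs: (N, D) float tensor of magnitude spectra
            opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
            for _ in range(epochs):
                opt.zero_grad()
                loss = nn.functional.mse_loss(dec(enc(hrtfs)), hrtfs)
                loss.backward()
                opt.step()

        def individualize(z0, virtual_user_score):
            """Tune the latent code so the decoded HRTF maximizes the
            virtual user's localization score; the feedback is a black
            box, so a derivative-free method like Nelder-Mead is natural."""
            def neg_score(z):
                with torch.no_grad():
                    h = dec(torch.as_tensor(z, dtype=torch.float32)).numpy()
                return -virtual_user_score(h)
            return minimize(neg_score, z0, method="Nelder-Mead").x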

    Gaussian process models for HRTF based 3D sound localization

    The human ability to localize sound-source direction using just two receivers is a complex process of direction inference from spectral cues of the sound arriving at the ears. While these cues can be described using the well-known head-related transfer function (HRTF) concept, it is unclear how densely the HRTF must be sampled and whether a higher-order representation is employed in localization. We propose a class of binaural sound-source localization models to answer these two questions. First, using the sound received by the two ears, we derive several binaural features that are invariant to the sound-source signal. Second, these are implicitly mapped to a high-dimensional reproducing kernel Hilbert space via a Gaussian process regression model for feature-direction tuples. Lastly, the features that are most relevant in the model are found via an efficient forward subset-selection method. Experimental results are shown for HRTFs belonging to the CIPIC database. Index Terms: Gaussian process regression, head-related transfer function, source cancellation algorithm, subset selection.
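
    One signal-invariant feature consistent with the abstract is the interaural level-difference spectrum: both ears observe the same source spectrum multiplied by their respective HRTFs, so the log-magnitude difference cancels the source term. A minimal sketch, assuming CIPIC-style arrays of per-direction HRTFs (the kernel choice and data shapes are illustrative, and the paper's forward subset selection of features is omitted):

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        def ild_features(h_left, h_right, eps=1e-9):
            # per-frequency level difference; the source term X(f) divides
            # out because each ear receives X(f) * H_ear(f)
            return 20 * np.log10(np.abs(h_left) + eps) - 20 * np.log10(np.abs(h_right) + eps)

        def fit_localizer(HL, HR, directions):
            """HL, HR: (N, F) complex HRTFs; directions: (N, 2) arrays of
            (azimuth, elevation). Returns a fitted GP mapping features to
            direction; gp.predict(...) then localizes new measurements."""
            X = ild_features(HL, HR)
            kernel = RBF(length_scale=10.0) + WhiteKernel(noise_level=1e-3)
            gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
            return gp.fit(X, directions)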